GENERALIZED SMIRNOV STATISTICS AND THE DISTRIBUTION OF PRIME FACTORS



KEVIN FORD 

Dedicated to Jean-Marc Deshouillers on the occasion of his 60th birthday 



Abstract. We apply recent bounds of the author for generalized Smirnov statistics to the
distribution of integers whose prime factors satisfy certain systems of inequalities.



1. Introduction 

For a positive integer $n$, denote by $p_1 < p_2 < \cdots < p_{\omega(n)}$ the sequence of distinct prime
factors of $n$. In this note, we study integers for which

(1.1)  $\log_2 p_j \ge \alpha j - \beta \quad (1 \le j \le \omega(n))$
or

(1.2)  $\log_2 p_j \le \alpha j + \beta \quad (1 \le j \le \omega(n)),$



where $\alpha > 0$ and $\log_2 y$ denotes $\log\log y$. The distribution of integers satisfying (1.1) is
important in the study of the distribution of divisors of integers (see j^]; Ch. 2 of [1]). We
present here estimates for

$N_k(x;\alpha,\beta) = \#\{n \le x : \omega(n) = k, \text{(1.1)}\}$

and

$M_k(x;\alpha,\beta) = \#\{n \le x : \omega(n) = k, \text{(1.2)}\}.$
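To make the two counting functions concrete, here is a minimal brute-force sketch in Python. It is only an illustration for very small $x$; the helper names (`distinct_prime_factors`, `count_N`, `count_M`, and so on) are hypothetical and not part of the paper, and `log2` below means the paper's $\log_2 y = \log\log y$, not a base-2 logarithm.

```python
import math

def distinct_prime_factors(n):
    """Distinct prime factors of n in increasing order (trial division)."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def log2(y):
    """The paper's log_2 y = log log y (requires y > 1)."""
    return math.log(math.log(y))

def satisfies_lower(n, alpha, beta):
    """Condition (1.1): log_2 p_j >= alpha*j - beta for every j."""
    return all(log2(p) >= alpha * j - beta
               for j, p in enumerate(distinct_prime_factors(n), start=1))

def satisfies_upper(n, alpha, beta):
    """Condition (1.2): log_2 p_j <= alpha*j + beta for every j."""
    return all(log2(p) <= alpha * j + beta
               for j, p in enumerate(distinct_prime_factors(n), start=1))

def count_N(x, k, alpha, beta):
    """Brute-force N_k(x; alpha, beta) for k >= 1."""
    return sum(1 for n in range(2, x + 1)
               if len(distinct_prime_factors(n)) == k
               and satisfies_lower(n, alpha, beta))

def count_M(x, k, alpha, beta):
    """Brute-force M_k(x; alpha, beta) for k >= 1."""
    return sum(1 for n in range(2, x + 1)
               if len(distinct_prime_factors(n)) == k
               and satisfies_upper(n, alpha, beta))
```

For instance, with $\alpha = 1$ and $\beta = 2$ the integer $6 = 2\cdot 3$ satisfies (1.1), while $30 = 2\cdot 3\cdot 5$ does not, because $\log_2 5 \approx 0.476 < 3 - 2$.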

It is a relatively simple matter, at least heuristically, to reduce the estimation of $N_k(x;\alpha,\beta)$
and $M_k(x;\alpha,\beta)$ to the estimation of a certain probability connected to Kolmogorov-Smirnov
statistics. Let us focus on the upper bound for $N_k(x;\alpha,\beta)$. If we suppose that $p_k > x^c$
for some small $c$, then for each choice of $(p_1,\dots,p_{k-1})$, the number of possible $p_k$ is
$\ll x/(p_1\cdots p_{k-1}\log x)$. Since $\sum_{p\le y} 1/p \sim \log_2 y$, given a well-behaved function $f$, by partial
summation we anticipate that



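(1.3)  $\sum_{p_1 < \cdots < p_{k-1} \le x} \frac{f\left(\frac{\log_2 p_1}{\log_2 x}, \dots, \frac{\log_2 p_{k-1}}{\log_2 x}\right)}{p_1 \cdots p_{k-1}} \approx (\log_2 x)^{k-1} \int\cdots\int_{0 \le \xi_1 \le \cdots \le \xi_{k-1} \le 1} f(\xi)\, d\xi,$

where $\xi = (\xi_1, \dots, \xi_{k-1})$.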
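The partial summation step rests on Mertens' estimate $\sum_{p \le y} 1/p = \log_2 y + O(1)$; in fact the difference tends to a constant close to $0.2615$. The following Python sketch checks this numerically. The function names are hypothetical, and the sieve is a plain sieve of Eratosthenes chosen for brevity rather than speed.

```python
import math

def primes_up_to(y):
    """All primes <= y via a simple sieve of Eratosthenes (y >= 2)."""
    sieve = bytearray([1]) * (y + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(y) + 1):
        if sieve[p]:
            # Cross out multiples of p starting at p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, y + 1, p)))
    return [p for p in range(2, y + 1) if sieve[p]]

def reciprocal_prime_sum(y):
    """Sum of 1/p over primes p <= y."""
    return sum(1.0 / p for p in primes_up_to(y))

def mertens_gap(y):
    """Difference between the prime reciprocal sum and log log y."""
    return reciprocal_prime_sum(y) - math.log(math.log(y))
```

Evaluating `mertens_gap` at increasing values of `y` shows the difference settling near the Mertens constant, which is why sums of $1/p$ over a range of primes behave like lengths on the $\log_2$ scale.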

Let $U_1, \dots, U_m$ be independent, uniformly distributed random variables in $[0, 1]$ and let
$\xi_1, \dots, \xi_m$ be their order statistics ($\xi_1$ is the smallest of the $U_i$, $\xi_2$ is the next smallest, etc.).
Taking $m = k - 1$, the right side of (1.3) is equal to $(\log_2 x)^{k-1}/(k-1)!$ times the expectation
of $f(\xi_1, \dots, \xi_{k-1})$. Letting $f$ be 1 if (1.1) holds and 0 otherwise, the expectation of $f$ is the
probability that $\xi_j \ge (\alpha j - \beta)/\log_2 x$ for each $j$.

In general, let $Q_m(u,v)$ be the probability that $\xi_i \ge \frac{i-u}{v}$ for $1 \le i \le m$. Equivalently, if
$u \ge 0$ then

$Q_m(u,v) = \mathrm{Prob}\left( F_m(t) \le \frac{vt+u}{m} \quad (0 \le t \le 1) \right),$

where $F_m(t) = \frac{1}{m}\sum_{U_i \le t} 1$ is the associated empirical distribution function. The first
estimates for $Q_m(u,v)$ were given in 1939 by N. V. Smirnov jl], who proved for each fixed $\lambda \ge 0$
the asymptotic formula

(1.4)  $Q_m(\lambda\sqrt{m}, m) \to 1 - e^{-2\lambda^2} \quad (m \to \infty).$
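For small $m$ the probability $Q_m(u,v)$ can be computed exactly, which is a convenient way to experiment with the definition. The sketch below is an illustration, not the method of the paper: it integrates polynomials with rational coefficients, one variable at a time, over the region $\xi_1 \le \cdots \le \xi_m \le 1$, $\xi_i \ge (i-u)/v$, and multiplies the resulting volume by $m!$. The name `smirnov_q` is hypothetical.

```python
from fractions import Fraction
from math import factorial

def smirnov_q(m, u, v):
    """Exact Q_m(u, v) = Prob(xi_i >= (i - u)/v for 1 <= i <= m)."""
    u, v = Fraction(u), Fraction(v)
    # Lower thresholds a_i for xi_i, clipped at 0; they are nondecreasing in i.
    a = [max(Fraction(0), (i - u) / v) for i in range(1, m + 1)]
    if a and a[-1] > 1:
        return Fraction(0)  # xi_m <= 1 makes the event impossible
    # poly holds the coefficients (lowest degree first) of h_i(x), the volume
    # of {a_j <= xi_j for j <= i, xi_1 <= ... <= xi_i <= x}; h_0 = 1.
    poly = [Fraction(1)]
    for a_i in a:
        # h_i(x) = integral of h_{i-1}(t) dt from a_i to x.
        anti = [Fraction(0)] + [c / (j + 1) for j, c in enumerate(poly)]
        anti[0] -= sum(c * a_i**j for j, c in enumerate(anti))
        poly = anti
    volume = sum(poly)  # h_m evaluated at x = 1
    return factorial(m) * volume
```

A hand-checkable case is `smirnov_q(2, 0, 4)`, which equals `Fraction(1, 2)`: the region is $\xi_1 \ge 1/4$, $\xi_2 \ge 1/2$. To get a feel for (1.4), one can compare `float(smirnov_q(49, 7, 49))`, which corresponds to $\lambda = 1$ and $m = 49$, with $1 - e^{-2}$; the agreement is only approximate at this size.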

The sharpest and most general bounds are due to the author see also [H- For convenience, 
write $w = u + v - m$. Uniformly in $u \ge 0$, $w \ge 0$ and $m \ge 1$, we have

(1.5)  $Q_m(u,v) = 1 - e^{-2uw/m} + O\left(\frac{u+w}{m}\right).$
Moreover,

(1.6)  $Q_m(u,v) \asymp \min\left(1, \frac{uw}{m}\right) \quad (m \ge 1,\ u \ge 1,\ w \ge 1).$

See for more information about the history of such bounds and techniques for proving 
them. A short proof of weaker bounds is given in §11 of 

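Returning to our heuristic estimation of $N_k(x;\alpha,\beta)$ (and assuming that a similar lower bound
holds), we find that

$N_k(x;\alpha,\beta) \approx \frac{x(\log_2 x)^{k-1}}{(k-1)!\,\log x}\, Q_{k-1}\left(\frac{\beta}{\alpha}, \frac{\log_2 x}{\alpha}\right).$

We have (cf. Theorem 4 in §II.6.1 of [6])

(1.7)  $\pi_k(x) := \#\{n \le x : \omega(n) = k\} \asymp_A \frac{x(\log_2 x)^{k-1}}{(k-1)!\,\log x}$

uniformly for $1 \le k \le A\log_2 x$, $A$ being any fixed positive constant. Thus, we anticipate
that

$N_k(x;\alpha,\beta) \asymp Q_{k-1}\left(\frac{\beta}{\alpha}, \frac{\log_2 x}{\alpha}\right)\pi_k(x).$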

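Observing that the vectors $(\xi_1, \dots, \xi_m)$ and $(1 - \xi_m, 1 - \xi_{m-1}, \dots, 1 - \xi_1)$ have identical
distributions, we have

$Q_m(u,v) = \mathrm{Prob}\left( \xi_i \le \frac{u + v - m - 1 + i}{v} \quad (1 \le i \le m) \right).$

Hence, we likewise anticipate that

$M_k(x;\alpha,\beta) \asymp Q_{k-1}\left(k + \frac{\beta - \log_2 x}{\alpha}, \frac{\log_2 x}{\alpha}\right)\pi_k(x).$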

To make our heuristics rigorous, we must impose some conditions on $\alpha$ and $\beta$ to ensure
among other things that there are integers satisfying (1.1) or (1.2). To that end, we set

(1.8)  $u = \frac{\beta}{\alpha}, \quad v = \frac{\log_2 x}{\alpha}, \quad w = u + v - (k-1) = \frac{\log_2 x + \beta}{\alpha} - k + 1$

for the estimation of $N_k(x;\alpha,\beta)$ and

(1.9)  $u = k + \frac{\beta - \log_2 x}{\alpha}, \quad v = \frac{\log_2 x}{\alpha}, \quad w = u + v - (k-1) = 1 + \frac{\beta}{\alpha}$

for the estimation of $M_k(x;\alpha,\beta)$.
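As a check on the parameters used for $M_k$, they can be recovered by matching condition (1.2) against the reflected form of $Q_m$ given above. The following is a sketch of that bookkeeping only, taking $m = k-1$ and $\xi_j = \log_2 p_j/\log_2 x$ as in the heuristic; it uses nothing beyond the definitions already stated.

```latex
% Condition (1.2) in the normalized variables \xi_j = \log_2 p_j / \log_2 x:
\xi_j \le \frac{\alpha j + \beta}{\log_2 x}
      = \frac{j + \beta/\alpha}{(\log_2 x)/\alpha}
\qquad (1 \le j \le k-1).
% Reflected form of Q_m with m = k-1:
\xi_j \le \frac{u + v - k + j}{v}.
% Matching denominators, then numerators:
v = \frac{\log_2 x}{\alpha}, \qquad
u + v - k = \frac{\beta}{\alpha}
\;\Longrightarrow\;
u = k + \frac{\beta - \log_2 x}{\alpha}, \qquad
w = u + v - (k-1) = \frac{\beta}{\alpha} + 1.
```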

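Theorem 1. Suppose $\varepsilon > 0$, $A \ge 1$ and $1 \le k \le A\log_2 x$. Assume (1.8), $\beta \ge 0$, $\alpha - \beta \le A$,
$w \ge 1 + \varepsilon$ and

(1.10)  $e^{\alpha(w-1)} - e^{\alpha(w-2)} \ge 1 + \varepsilon.$
Then, for sufficiently large $x$, depending on $\varepsilon$ and $A$,

$N_k(x;\alpha,\beta) \asymp_{\varepsilon,A} \min\left(1, \frac{(u+1)w}{k}\right)\pi_k(x),$

the implied constants depending only on $\varepsilon$ and $A$.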

Theorem 2. Suppose $A \ge 1$ and $1 \le k \le A\log_2 x$. Assume (1.9), $u \ge 1$, $w \ge 0$ and that
for $1 \le j \le k$, there are at least $j$ primes $\le \exp\exp(\alpha j + \beta)$. Then, for sufficiently large $x$,
depending on $A$,

$M_k(x;\alpha,\beta) \asymp_A \min\left(1, \frac{u(w+1)}{k}\right)\pi_k(x),$
the implied constants depending only on $A$.

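Remarks. Inequality (1.10) is necessary, since for large $k$, (1.1) implies

$\log n \ge \sum_{j=1}^{k} \log p_j \ge \sum_{j=1}^{k} e^{\alpha j - \beta} \approx \frac{e^{\alpha k - \beta}}{1 - e^{-\alpha}} = \frac{\log x}{e^{\alpha(w-1)} - e^{\alpha(w-2)}}.$

The condition $\alpha - \beta \le A$ in Theorem 1 means that there is no significant restriction on $p_1$.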

It is a simple matter to apply the estimates for $N_k(x;\alpha,\beta)$ and $M_k(x;\alpha,\beta)$ to problems
of the distribution of prime factors of integers where $\omega(n)$ is not fixed. In the following, let
$\omega(n,t)$ be the number of distinct prime factors of $n$ which are $\le t$. It is well-known (cf. Ch.
1 of [4]) that $\omega(n,t)$ has normal order $\log_2 t$. We estimate below the likelihood that $\omega(n,t)$
does not stray too far from $\log_2 t$ in one direction.



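Corollary 1. Uniformly for large $x$ and $0 \le \beta \le \sqrt{\log_2 x}$, we have

(1.11)  $\#\{n \le x : \forall t,\ 2 \le t \le x,\ \omega(n,t) \le \max(0, \log_2 t + \beta)\} \asymp \frac{(\beta+1)x}{\sqrt{\log_2 x}}$

and

(1.12)  $\#\{n \le x : \forall t,\ 2 \le t \le x,\ \omega(n,t) \ge \log_2 t - \beta\} \asymp \frac{(\beta+1)x}{\sqrt{\log_2 x}}.$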

Proof of Corollary 1. The quantity on the left side of (1.11) is $\sum_k N_k(x; 1, \beta)$. Here $u = \beta$,
$v = \log_2 x$ and $w = \log_2 x + \beta - k + 1$. By Theorem 1 and (1.7),

$\sum_{\log_2 x - 2\sqrt{\log_2 x} \le k \le \log_2 x - \sqrt{\log_2 x}} N_k(x;1,\beta) \gg \frac{(\beta+1)x}{\sqrt{\log_2 x}},$

since $\pi_k(x) \asymp x/\sqrt{\log_2 x}$ for $|k - \log_2 x| \le 2\sqrt{\log_2 x}$. This proves the lower bound in (1.11).
For the upper bound, we note that if $k > \log_2 x + \beta$, then $N_k(x; 1, \beta) = 0$. Hence, by Theorem
1 and (1.7),

$\sum_k N_k(x;1,\beta) \ll \sum_{k \le \log_2 x + \beta - 2} \frac{(\beta+1)(\log_2 x + \beta - k + 1)}{k}\,\pi_k(x) + \sum_{\log_2 x + \beta - 2 < k \le \log_2 x + \beta} \pi_k(x) \ll \frac{(\beta+1)x}{\sqrt{\log_2 x}}.$

This proves the upper bound in (1.11).
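The proof uses that $\pi_k(x)$ has order $x/\sqrt{\log_2 x}$ when $|k - \log_2 x| \le 2\sqrt{\log_2 x}$. Writing $L = \log_2 x$, the main term of (1.7) divided by $x$ is $L^{k-1}e^{-L}/(k-1)!$, a Poisson weight, so the claim can be sanity-checked numerically. The sketch below does this under the assumption that only the main term of (1.7) matters; `poisson_weight` and `scaled_weights` are hypothetical names.

```python
import math

def poisson_weight(k, L):
    """L^(k-1) * exp(-L) / (k-1)!, the main term of (1.7) divided by x,
    where L stands for log_2 x. Computed in logarithms to avoid overflow."""
    return math.exp((k - 1) * math.log(L) - math.lgamma(k) - L)

def scaled_weights(L):
    """sqrt(L) * weight for each integer k with |k - L| <= 2*sqrt(L)."""
    lo = max(1, math.ceil(L - 2 * math.sqrt(L)))
    hi = math.floor(L + 2 * math.sqrt(L))
    return {k: math.sqrt(L) * poisson_weight(k, L) for k in range(lo, hi + 1)}
```

If the scaled values stay between two positive constants as `L` grows, that matches the stated order of magnitude on this range of $k$; the largest values sit near $k \approx L$ and are close to $1/\sqrt{2\pi}$.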

The quantity on the left side of (1.12) is $\sum_k M_k(x; 1, \beta - 1)$. Here $v = \log_2 x$, $u =
\beta + k - \log_2 x$ and $w = \beta$. By Theorem 2,

$\sum_{\log_2 x + \sqrt{\log_2 x} \le k \le \log_2 x + 2\sqrt{\log_2 x}} M_k(x;1,\beta-1) \gg \frac{(\beta+1)x}{\sqrt{\log_2 x}},$

proving the lower bound in (1.12). Also by Theorem 2,

$\sum_{\log_2 x - \beta + 1 \le k \le 10\log_2 x} M_k(x;1,\beta-1) \ll \frac{(\beta+1)x}{\sqrt{\log_2 x}}.$

If $\omega(n) = k > 10\log_2 x$, then the number, $\tau(n)$, of divisors of $n$ satisfies $\tau(n) \ge 2^{\omega(n)} \ge
(\log x)^6$. Since $\sum_{n \le x} \tau(n) \sim x\log x$, the number of $n \le x$ with $\omega(n) > 10\log_2 x$ is
$O(x/\log^5 x)$. By (1.7), the number of $n \le x$ with $\log_2 x - \beta - 4 \le k \le \log_2 x - \beta + 1$
is $O(x/\sqrt{\log_2 x})$. Finally, suppose $k < \log_2 x - \beta - 4$. The number of $n \le x$ for which $d^2 \mid n$
for some $d > \log x$ is $O(x\sum_{d > \log x} d^{-2}) = O(x/\log x)$. If there is no such $d$, then by (1.2),

$\log n \le 2\log_2 x + \sum_{i=1}^{k} \log p_i \le 2\log_2 x + \sum_{i=1}^{k} e^{i+\beta-1} \le 2\log_2 x + 2e^{k+\beta-1} \le \frac12\log x,$

thus $n \le \sqrt{x}$. This completes the proof of the upper bound in (1.12).

Our methods for proving Theorems 1 and 2 are borrowed from [3], especially sections 8,
10 and 12 therein. The tools there are adequate for making precise the heuristic argument
outlined above when the function $f$ is monotonic in each variable, even if $f$ is discontinuous.
We provide details only for Theorem 1. In the lower bound for $M_k(x;\alpha,\beta)$ we may need to fix
several of the smallest prime factors of $n$, but otherwise the details of the proof of Theorem
2 are very similar.

2. Certain partitions of the primes 

We describe in this section certain partitions of the primes which will be needed in the
proof of Theorems 1 and 2. The constructions are similar to those given in §4 and §8 of ^].
Let $\lambda_0 = 1.9$ and inductively define $\lambda_j$ to be the largest prime such that

$\sum_{\lambda_{j-1} < p \le \lambda_j} \frac{1}{p} \le 1.$

In particular, $\lambda_1 = 3$ and $\lambda_2 = 109$. By Mertens' estimate, $\log_2 \lambda_j = j + O(1)$. Let $G_j$ be the
set of primes in $(\lambda_{j-1}, \lambda_j]$ for $j \ge 1$. Then there is an absolute constant $K$ so that if $p \in G_j$
then $|\log_2 p - j| \le K$.
Next, let $Q > 0$ and $\gamma = 1/\log Q$. If $p \le Q$, then $p^\gamma \le e$, hence $p^\gamma \le 1 + (e-1)\gamma\log p$.
By Mertens' estimates,

$\sum_{p \le Q,\ f \ge 1} \frac{1}{p^{(1-\gamma)f}} = O(1) + \sum_{p \le Q} \frac{1 + (e-1)\gamma\log p}{p} = \log_2 Q + O(1).$
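The point of the exponent $1-\gamma$ is that it costs only a bounded amount: the convexity bound $p^\gamma \le 1 + (e-1)\gamma\log p$ for $p \le Q$ keeps the weighted sum within $O(1)$ of $\log_2 Q$. The Python sketch below checks both facts numerically for the terms with $f = 1$ only (an assumption made for simplicity; higher prime powers contribute a further bounded amount). The function names are hypothetical.

```python
import math

def primes_up_to(y):
    """All primes <= y via a simple sieve of Eratosthenes (y >= 2)."""
    sieve = bytearray([1]) * (y + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(y) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, y + 1, p)))
    return [p for p in range(2, y + 1) if sieve[p]]

def rankin_sum(Q):
    """Sum over primes p <= Q of 1 / p^(1 - gamma), gamma = 1 / log Q."""
    gamma = 1.0 / math.log(Q)
    return sum(p ** (gamma - 1.0) for p in primes_up_to(Q))

def convexity_gap(p, Q):
    """(1 + (e - 1) * gamma * log p) - p^gamma; nonnegative for 2 <= p <= Q."""
    gamma = 1.0 / math.log(Q)
    return 1.0 + (math.e - 1.0) * gamma * math.log(p) - p ** gamma

def excess_over_log2(Q):
    """rankin_sum(Q) minus log log Q; expected to stay bounded in Q."""
    return rankin_sum(Q) - math.log(math.log(Q))
```

Calling `excess_over_log2` for several values of `Q` gives numbers that do not grow with `Q`, which is the content of the displayed estimate restricted to $f = 1$.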

It follows for an absolute constant $K'$, independent of $Q$, that the set of primes $p \le Q$ may
be partitioned into at most $\frac12\log_2 Q + K'$ sets $E_j$ so that (i) for each $j$,

$\sum_{p \in E_j,\ f \ge 1} \frac{1}{p^{f(1-\gamma)}} \le 2$

and (ii) for $p \in E_j$, $|\log_2 p - 2j| \le K'$. We stipulate that the above sum is $\le 2$ rather than
$\le 1$ in order to accommodate the prime 2.

3. Proof of Theorem 1 upper bound

Without loss of generality, suppose that $k$ is large, $(u+1)w \le k/10$, and $n > x/\log x$. We
have $v \le 1.1k$ and consequently $\alpha \ge 1/(1.1A)$. Also, by (1.1),

$\log_2 p_k \ge \alpha k - \beta = \frac{k-u}{v}\log_2 x \ge \frac{9}{11}\log_2 x.$

We may suppose $p_k^2 \nmid n$, as the number of $n \le x$ with $p_k^2 \mid n$ is $O(x\exp(-(\log x)^{9/11})) =
O(\pi_k(x)/k)$. For brevity, write $x_\ell = x^{1/2^\ell}$. For some integer $\ell$ satisfying $\ell \ge 0$ and
$\exp\exp(\alpha k - \beta) \le x_\ell$, we have $x_{\ell+1} < p_k \le x_\ell$. With $\ell$ fixed, given $p_1, \dots, p_{k-1}$ with
exponents $f_1, \dots, f_{k-1}$, the number of possibilities for $p_k$ is
1—^/2 £ 



«-7 — ^ « 



vx ■■■ pi-i log Xi {p{' ■ ■ ■ pi^zi y~nogx 

where 7 = l/logX£. This follows for ^ >1 from Pi^ ■■■p^kll > x/ {pk\ogx) > x^l'^ . We 
conclude that 

(3.13) iV,(x;a,/?)«-^5^e^-^^^ 7T ITT;— " 

/i,-,/fc-i>i 

Consider the intervals $E_j$ defined in the previous section corresponding to $Q = x_\ell$. Put
$J = \lfloor \frac12\log_2 x_\ell + K' \rfloor$ and define $j_1, \dots, j_{k-1}$ by $p_i \in E_{j_i}$. Let $\mathcal{J}$ denote the set of tuples
$(j_1, \dots, j_{k-1})$ so that $1 \le j_1 \le \cdots \le j_{k-1} \le J$ and such that $j_i \ge \frac12(\alpha i - \beta - K' - A)$ for
every $i$. Given $p_1, \dots, p_{k-1}$, let $b_j$ be the number of $p_i$ in $E_j$, for $1 \le j \le J$. The contribution
to the inner sum of (3.13) from those tuples of primes with a fixed $(j_1, \dots, j_{k-1})$ is

$\le \prod_{j=1}^{J} \frac{1}{b_j!}\left(\sum_{p \in E_j,\ f \ge 1} \frac{1}{p^{f(1-\gamma)}}\right)^{b_j} \le \frac{2^{k-1}}{b_1! \cdots b_J!}.$


We observe that $1/(b_1! \cdots b_J!)$ is the volume of the region $(y_1, \dots, y_{k-1}) \in \mathbb{R}^{k-1}$ satisfying
$0 \le y_1 \le \cdots \le y_{k-1} \le J$ and $j_i - 1 \le y_i \le j_i$ for each $i$ (there are $b_j$ numbers $y_i$ in each
interval $(j-1, j]$). Making the change of variables $\xi_i = y_i/J$ and summing over all possible
vectors $(j_1, \dots, j_{k-1}) \in \mathcal{J}$, we find that the inner sum in (3.13) is

$\le (2J)^{k-1}\,\mathrm{Vol}\left\{0 \le \xi_1 \le \cdots \le \xi_{k-1} \le 1 : \xi_i \ge \frac{\alpha i - \beta - K' - A - 2}{2J}\ (1 \le i \le k-1)\right\}$

$\le \frac{(\log_2 x + 2K')^{k-1}}{(k-1)!}\, Q_{k-1}\left(\frac{\beta + K' + A + 2}{\alpha}, \frac{2J}{\alpha}\right)$

$\ll \frac{(\log_2 x)^{k-1}}{(k-1)!} \cdot \frac{(u+1)w}{k},$

where we have used (1.6). By (3.13), summing on $\ell$ and using (1.7) completes the proof.

4. Proof of Theorem 1 lower bound

First, we assume $k \ge 2$, since if $k = 1$ then $N_1(x;\alpha,\beta) = \pi_1(x) + O(\log x)$ trivially as
$A + \beta \ge \alpha$ (powers of primes $\le e^{e^{A}}$ are not counted in $N_1(x;\alpha,\beta)$). Also, we may assume
that $\alpha \ge 1/2A$. If $\alpha < 1/2A$, then $N_k(x;\alpha,\beta) \ge N_k(x; 1/2A, 0)$ and we prove below that
$N_k(x; 1/2A, 0) \gg \pi_k(x)$ (here $u = 0$, $v \ge 2k$ and $w \ge k$).

Let $T$ be a sufficiently large constant, depending on $\varepsilon$ and $A$, and put

$C = e^{2K + 3T + 10}.$

We first prove the theorem in the case that

(4.14)  $e^{\alpha(w-1)} - e^{\alpha(w-2)} \ge C.$

Notice that

(4.15)  $\alpha j - \beta = \log_2 x - \alpha(w + k - 1 - j).$
In particular,

$\alpha k - \beta = \log_2 x - \alpha(w-1) \le \log_2 x - \log C.$
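The identity (4.15) and the inequality after it follow directly from the definitions in (1.8). A short derivation, written out as a sketch for the reader's convenience:

```latex
% From (1.8): w = u + v - (k-1) with u = \beta/\alpha and v = (\log_2 x)/\alpha, so
\alpha w = \beta + \log_2 x - \alpha(k-1)
\;\Longrightarrow\;
\beta = \alpha w - \log_2 x + \alpha(k-1).
% Substituting into \alpha j - \beta gives (4.15):
\alpha j - \beta = \log_2 x - \alpha(w + k - 1 - j).
% Taking j = k and using e^{\alpha(w-1)} \ge e^{\alpha(w-1)} - e^{\alpha(w-2)} \ge C from (4.14):
\alpha k - \beta = \log_2 x - \alpha(w-1) \le \log_2 x - \log C.
```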

Let $J = \lfloor \log_2 x - K - \log T - 2 \rfloor$. Recall the definition of the numbers $\lambda_j$ and sets $G_j$ from
section 2. Consider squarefree $n$ satisfying (1.1), with $p_{k-1} \le \lambda_J$ and for which

$p_1 \cdots p_{k-1} \le x^{1/2}.$

Also take $p_k$ so that $x/2 < n \le x$. Given $p_1, \dots, p_{k-1}$, the number of possible $p_k$ is $\gg
x/(p_1 \cdots p_{k-1}\log x)$. Put $b_1 = \cdots = b_{T-1} = 0$ and for $T \le j \le J$, suppose $b_j \le \min(T(j -
T + 1), T(J - j + 1))$. Suppose there are exactly $b_j$ primes $p_i$ in the set $G_j$ for $1 \le j \le J$.
By the definition of $J$,

$\sum_{i=1}^{k-1} \log p_i \le Te^{J+K}\sum_{r \ge 1} r e^{1-r} \le 3Te^{J+K} \le \frac12\log x,$

as required. Define the numbers $j_i$ by $p_i \in G_{j_i}$. The inequalities (1.1) will be satisfied if
(4.16)  $j_i \ge \alpha i - \beta + K \quad (1 \le i \le k-1).$

This is possible since by (4.14),

$\alpha(k-1) - \beta = \log_2 x - \alpha w \le \log_2 x - 2K - 3T - 10 \le J - T - 1.$
With $(j_1, \dots, j_{k-1})$ fixed (so that $b_1, \dots, b_J$ are fixed), the sum of $1/(p_1 \cdots p_{k-1})$ is



J 



TT-f T - T y - 

J--'- 6,! V ^ Pi ^ P2 ^ Pb 

P2^P1 P6,0{pi,...,P6,-l} 



~ fL^ ¥ V e^P(i - I- K) 

> 



if $T$ is large enough. The right side is $1/2$ of the volume of the region of $(y_1, \dots, y_{k-1}) \in \mathbb{R}^{k-1}$
satisfying $0 \le y_1 \le \cdots \le y_{k-1} \le J - T + 1$ and $j_i - T \le y_i \le j_i + 1 - T$ for each $i$. Set
$H = J - T + 1$. Assume that

(4.17)  $j_{mT+1} \ge T + m, \quad j_{k-1-mT} \le J - m$ (integers $m \ge 1$),

so that $b_j \le \min(T(j - T + 1), T(J - j + 1))$ for each $j$. Making the substitution $\xi_i = y_i/H$
and summing over all tuples $(j_1, \dots, j_{k-1})$ yields

(4.18) Nk{x;a,(3) » ^Vol(i?) »a ^^^^^^Vol(i?), 

log X log X 

where, by (4.16) and (4.17), $R$ is the set of $\xi$ satisfying (i) $0 \le \xi_1 \le \cdots \le \xi_{k-1} \le 1$,
$\xi_i \ge (\alpha i - \beta + K - T)/H$ for each $i$, (ii) $\xi_{mT+1} \ge m/H$ and $\xi_{k-1-mT} \le 1 - m/H$ for each
positive integer $m$.

It remains to estimate from below the volume of $R$. Let $S$ be the set of $\xi$ satisfying (i),
so that

$\mathrm{Vol}(S) = \frac{Q_{k-1}(\mu,\nu)}{(k-1)!}, \quad \mu = \frac{\beta + T - K}{\alpha}, \quad \nu = \frac{H}{\alpha}.$

li T > K + A, then fi u + 1. By the definition of C and J, if T is large enough then 

fi + u-{k-l) = -{k-l)>w > > 1. 

a a 1 + e 

Hence, by (1.6),

(4.19)  $\mathrm{Vol}(S) \gg \frac{f}{(k-1)!}, \quad f = \min(1, (u+1)w/k).$

The implied constant in (4.19) does not depend on $T$, but the inequality does require that
$T$ be sufficiently large.

For a positive integer $m$, let

$V_1(m) = \mathrm{Vol}\{\xi \in S : \xi_{mT+1} < m/H\},$

$V_2(m) = \mathrm{Vol}\{\xi \in S : \xi_{k-1-mT} > 1 - m/H\}.$
We have by (1.6),

Vi{m) < Vol{0 < <■■■< 6-1 < 1 : 6 > ^ (^T + 2 < z < A; - 1)} 



•C 

< 



(mT + 1)! {k-2-mT)\ 
(m/ij')'"^+^ + ;/ - (A; - 1)) 

(mT + 1)! {k -mT){k - 2 -mT)\ 

fk{m/H)"'^+^ 
{k - mT){mT + - 2 - mT)! 
/ {km/H)'^^+^ k 



{k-l)\ (mT + 1)! A;-mT 
Since $k/H \ll 1$ and $r! \ge (r/e)^r$, it follows from (4.19) that for large enough $T$,

m 

Similarly, 

^ Qfc_2-^r(/i.^) {m/Hr^+' 
'^""^ - (A; - 2 - mT)! (mT + 1)! ' 

By (1.6),

/ /i(/i + z/-(fc-l)+mT + l) ^^ mTfc/ 
(5fe_2-mr(/i, z^) < mm 1, ^ < 



A; — mT / A; — mT 

Hence, if T is large enough then 

Y,V2{m)<^-Vo\{S). 

m 

We therefore have, for T large enough, 

Vol(i?) > Vol(5) - 5^(^i(m) + V2{m)) >a f ■ 



m>l 



Together with (4.18) and (1.7), this completes the proof under the assumption (4.14).
It remains to consider the case

$1 + \varepsilon \le e^{\alpha(w-1)} - e^{\alpha(w-2)} < C.$

Since $w \ge 1 + \varepsilon$ and $\alpha \ge 1/2A$, we find that $\alpha \ll_{\varepsilon,A} 1$ and $w \ll_{\varepsilon,A} 1$. Hence, if $x$ is large
enough,

A; = u + f — w + l>i; — ty> — . 

~ ~ AA 

Let $B$ be a large integer depending on $\varepsilon$. Suppose that

(4.20)  $\alpha j - \beta \le \log_2 p_j \le \alpha j - \beta + \log(1 + \varepsilon/2) \quad (k - B \le j \le k - 1).$




Then, by (4.15),

k-l 

J2 logPi < (1 + ^/2) (e"""' + e-"("'+^) + ■ ■ ■ + e-"(-+^-i)) log a; 
< (1 + e/2) ( , - e-"^-'"-A log X. 

Assume also that 

k-B-l ^.^ 

(4.21) logp. < ^ — -, — ^logx. 



If in addition $\alpha k - \beta \le \log_2 p_k \le \alpha k - \beta + \log(1 + \varepsilon/2)$, then by (1.10),

£/2 + l + e/2 , 
logn = > logp,- < — -, — -T -, — -r logx < logx, 

as required. Thus, given $p_1, \dots, p_{k-1}$ satisfying (4.20) and (4.21), the number of $p_k$ is $\gg
x/(p_1 \cdots p_{k-1}\log x)$. If $B$ is large enough, there is great flexibility in choosing $p_1, \dots, p_{k-B-1}$,
since by (4.15),

/ ^ — ga(to— 1) ga(to— 2) ° ' 

which is small compared with the right side of (4.21). By the same argument used to give a
lower bound for the sum of $1/(p_1 \cdots p_{k-1})$ under the assumption (4.14), we obtain

V- 1 /(log;.!)'-"-' 

Also, since k log2 x, we have 

2^ ^e,B 1 >e,A (log^x)^ ^ \ 

„ „ Pk-B---Pk~i [k-iy- 

Pk-B,---,Pk-l 

The proof is again completed by applying (1.7).